-
Notifications
You must be signed in to change notification settings - Fork 611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable LTO optimization by default for runtime releases. #16811
Conversation
@ScottTodd I expect that the change of the linking mode has effectively just fuzzed execution a bit and caused a latent bug to surface in the tooling. I'm going to squash this branch for easy access but could use your help to resolve/land. |
This is done by generalizing the primordial `IREE_SIZE_OPTIMIZED` flag into a `IREE_RUNTIME_OPTIMIZATION_PROFILE` that: * Can enable 'lto' or 'size'. * Is scoped to just the runtime targets. * Minimally does the right thing for 'size' on Linux vs just on Windows (not the goal of this patch but drops ~300KB from binary sizes when enabled). The compile time delta for a clean build of the runtime in full LTO vs regular mode was not measured precisely but is in the noise (i.e. <1m). As such, just enabling by default for Python release binaries. Others can be enabled via: `-DIREE_RUNTIME_OPTIMIZATION_PROFILE=lto`, which is recommended for benchmarking, etc. Progress on #898.
f0ac6f9
to
5b96e90
Compare
Interesting. I can help tomorrow. A few thoughts from a quick look:
|
Retried the failing jobs... same test failures on both CPU and Vulkan. So could mark as XFAIL to unblock this. Should still investigate though. |
On my Windows machine I see the compiler crashing on these test cases at tip of tree and with this PR:
So the CPU failures on CI here are strange... Could skip them instead of XFAIL to keep the same set of tests running across platforms. |
Would really help to have asserts enabled in the pkgci build that these tests use. That is disabled right now: https://github.com/openxla/iree/blob/ee32fc7b74b4aad49fc46d67ea6c1f617416c4d4/.github/workflows/pkgci_build_packages.yml#L81-L83 |
Fwiw, I think I fixed the issue behind that todo un another setup. But yeah. Ok, I'm taking all of this as orthogonal to this patch (but still needs addressing). How do I mark those tests? (Sorry, running a few things here...) |
https://github.com/openxla/iree/tree/main/experimental/regression_suite/external_test_suite the CPU and Vulkan JSON files. Can add to either CPU should add these:
Vulkan should add these:
|
Okay, CI is happy now. |
This is done by generalizing the primordial `IREE_SIZE_OPTIMIZED` flag into a `IREE_RUNTIME_OPTIMIZATION_PROFILE` that: * Can enable 'lto' or 'size'. * Is scoped to just the runtime targets. * Minimally does the right thing for 'size' on Linux vs just on Windows (not the goal of this patch but drops ~300KB from binary sizes when enabled). The compile time delta for a clean build of the runtime in full LTO vs regular mode was not measured precisely but is in the noise (i.e. <1m). As such, just enabling by default for Python release binaries. Others can be enabled via: `-DIREE_RUNTIME_OPTIMIZATION_PROFILE=lto`, which is recommended for benchmarking, etc. Note that this removes the use of the CMake option `IREE_SIZE_OPTIMIZED`. It was never even declared properly as an option and didn't do the same class of thing across Windows/Linux. This has been fixed and it can be enabled via `-DIREE_RUNTIME_OPTIMIZATION_PROFILE=size`. Note that as on Windows, this implies LTO. If old behavior without LTO is desired, we can add a profile for that. Progress on #898. --------- Co-authored-by: Scott Todd <[email protected]>
-DIREE_SIZE_OPTIMIZED=ON \ | ||
-DIREE_FORCE_LTO_COMPAT_BINUTILS_ON_LINUX=size \ | ||
-DIREE_FORCE_GCC_BINUTILS_ON_LINUX=ON \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have been
- -DIREE_SIZE_OPTIMIZED=ON \
- -DIREE_FORCE_LTO_COMPAT_BINUTILS_ON_LINUX=size \
- -DIREE_FORCE_GCC_BINUTILS_ON_LINUX=ON \
+ -DIREE_RUNTIME_OPTIMIZATION_PROFILE=lto" \
+ -DIREE_FORCE_LTO_COMPAT_BINUTILS_ON_LINUX=ON" \
Fixing in #18164
Similar to #18162, this reduces the reliance on scripts in [`build_tools/cmake/`](https://github.com/iree-org/iree/tree/main/build_tools/cmake) by inlining the CMake commands used for these build configurations. Summary of changes: * Deleted `build_tools/cmake/build_runtime_small.sh` and `build_tools/cmake/build_runtime_tracing.sh` * Fixed size-optimized ("small") options used in the "small runtime" job and resolved an unused variable warning/error in `runtime/src/iree/hal/drivers/hip/dynamic_symbols.c` that slipped through. When we landed #16811, it used old CMake variables in the `build_tools/cmake/build_runtime_small.sh` script. Oops! * Renamed jobs so they all begin with `runtime` | name before | name after | -- | -- `build_test_runtime` | `runtime` `small_runtime` | `runtime_small` `tracing` | `runtime_tracing` Seeing these runtime jobs next to each other, we could expand the build matrix even further ("all platforms" x "all configurations"). Leaving that for the future. Now it's easier to see what the jobs have in common and how they differ.
This is done by generalizing the primordial
IREE_SIZE_OPTIMIZED
flag into aIREE_RUNTIME_OPTIMIZATION_PROFILE
that:The compile time delta for a clean build of the runtime in full LTO vs regular mode was not measured precisely but is in the noise (i.e. <1m). As such, just enabling by default for Python release binaries.
Others can be enabled via:
-DIREE_RUNTIME_OPTIMIZATION_PROFILE=lto
, which is recommended for benchmarking, etc.Note that this removes the use of the CMake option
IREE_SIZE_OPTIMIZED
. It was never even declared properly as an option and didn't do the same class of thing across Windows/Linux. This has been fixed and it can be enabled via-DIREE_RUNTIME_OPTIMIZATION_PROFILE=size
. Note that as on Windows, this implies LTO. If old behavior without LTO is desired, we can add a profile for that.Progress on #898.